Target: IEC224 Phosphoserine phosphatase
NCBI Gene #: 11915261
Protein ID#: YP_005334106.1 (Used Dax's Oligo Design to order primers. Couldn't find Will's DNA Works Output on GDocs. Julia's first primer is small..? - Dr. B 090313)

organism: Vibrio cholerae
etiologic risk group: Appendix B-II-A. Risk Group 2 (RG2) - Bacterial Agents Including Chlamydia

Background/Disease Information (sort of like the Intro to your Mini Research Write up):
Vibrio cholerae is a bacterium that can cause the disease cholera, depending on the strain encountered. The first recorded pandemic began in 1817, when the disease spread along trade routes. Several pandemics have occurred since through the mass spread of the bacteria. Antibacterial treatments exist for extreme cases of the disease, and vaccines are available to prevent infection. However, the main treatment is oral rehydration therapy.

Cholera is spread when a host consumes food or drink that has been contaminated with the waste of an infected person. The main symptoms include diarrhea and vomiting, which typically appear one to five days after infection. In developing countries the usual source is contaminated water; in developed regions cholera is rare and is usually traced to seafood, particularly shellfish and plankton. The disease is nicknamed the "blue death" because the skin of a severely affected person can take on a bluish tint. Currently, around 3-5 million people worldwide experience cholera.

To cause infection, roughly 100 million bacteria must be ingested on average. This number is lower in people with weakened immune systems or certain other diseases. The most affected parts of the world include India, Haiti, and parts of Africa, among others.


Essentiality of this protein: essential
Complex of proteins?: works alone with current information available
Druggable Target: yes

*EC#: 3.1.3.3
Link to BRENDA EC# page: http://www.brenda-enzymes.org/php/result_flat.php4?ecno=3.1.3.3

-- Show screenshot of BRENDA enzyme mechanism schematic
sfd.PNG

Enzyme Assay information (spectrophotometric, coupled assay ?, reagents):
-- link to Sigma (or other company) page for assay or assay reagents (substrates)
http://www.sigmaaldrich.com/catalog/search?interface=All&term=phosphoserine&lang=en&region=US&focus=product&N=0+220003048+219853269+219853286
-- link (or citation) to paper that contains assay information
http://www.sciencedirect.com.ezproxy.lib.utexas.edu/science/article/pii/S0014299997013046?np=y

-- List cost and quantity of substrate reagents and supplier
phosphoserine 63.50
Structure Available (PDB or Homology model)
http://www.rcsb.org/pdb/explore/explore.do?structureId=3N28

Current Inhibitors: none found
Expression Information (has it been expressed in bacterial cells): according to BRENDA, this enzyme has been expressed in Porphyromonas gingivalis
Purification Method:
http://www.sciencedirect.com.ezproxy.lib.utexas.edu/science/article/pii/S0014299997013046?np=y
Image of protein (PyMol with features delineated and shown separately):
3n28_bio_r_500.jpg
*Amino Acid Sequence (paste as text only - not as screenshot or as 'code'):
MSLDALTTLPIKKHTALLNRFPETRFVTQLAKKRASWIVFGHYLTPAQFEDMDFFTNRFNAILDMWKVGRYEVALMDGEL TSEHETILKALELDYARIQDVPDLTKPGLIVLDMDSTAIQIECIDEIAKLAGVGEEVAEVTERAMQGELDFEQSLRLRVS KLKDAPEQILSQVRETLPLMPELPELVATLHAFGWKVAIASGGFTYFSDYLKEQLSLDYAQSNTLEIVSGKLTGQVLGEV VSAQTKADILLTLAQQYDVEIHNTVAVGDGANDLVMMAAAGLGVAYHAKPKVEAKAQTAVRFAGLGGVVCILSAALVAQQ KLSWKSKEGHHHHHH

Julia C.
Dr. B said that we should use the NCBI sequence instead of the PDB one. Here is the 'correct' amino acid sequence after filtering:
mdalttlpikkhtallnrfpetrfvtqlakkraswivfghyltpaqfedmdfftnrfnai ldmwkvgryevalmdgeltsehetilkaleldyariqdvpdltkpglivldmdstaiqie cideiaklagvgeevaevteramqgeldfeqslrlrvsklkdapeqilsqvretlplmpe lpelvatlhafgwkvaiasggftyfsdylkeqlsldyaqsntleivsgkltgqvlgevvs aqtkadilltlaqqydveihntvavgdgandlvmmaaaglgvayhakpkveakaqtavrf aglggvvcilsaalvaqqklswkskp


*length of your protein in Amino Acids: 335 (PDB sequence, including the His tag; Julia C.: 326 for the NCBI sequence)
Molecular Weight of your protein in kiloDaltons using the Expasy ProtParam website: 36981.5 Da (about 36.98 kDa)
Molar Extinction coefficient of your protein at 280 nm wavelength: 34045 M^-1 cm^-1
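As a cross-check on the extinction coefficient, here is a minimal sketch that recomputes it from the NCBI amino acid sequence pasted above. It assumes the per-residue values that ProtParam documents (Trp 5500, Tyr 1490, cystine 125) and assumes that all cysteines are paired into cystines; the function name `extinction_280` is made up for this example.

```python
# Contributions at 280 nm in M^-1 cm^-1 (values documented by ProtParam).
EXT_TRP, EXT_TYR, EXT_CYSTINE = 5500, 1490, 125

def extinction_280(seq, cys_paired=True):
    """Estimate the molar extinction coefficient at 280 nm from a sequence."""
    s = "".join(seq.split()).upper()          # drop spaces and line breaks
    # Assumption: when cys_paired is True, every two Cys form one cystine.
    cystines = s.count("C") // 2 if cys_paired else 0
    return (s.count("W") * EXT_TRP
            + s.count("Y") * EXT_TYR
            + cystines * EXT_CYSTINE)

# NCBI sequence as pasted on this page (Julia C.), in blocks of 60.
NCBI_SEQ = """
mdalttlpikkhtallnrfpetrfvtqlakkraswivfghyltpaqfedmdfftnrfnai
ldmwkvgryevalmdgeltsehetilkaleldyariqdvpdltkpglivldmdstaiqie
cideiaklagvgeevaevteramqgeldfeqslrlrvsklkdapeqilsqvretlplmpe
lpelvatlhafgwkvaiasggftyfsdylkeqlsldyaqsntleivsgkltgqvlgevvs
aqtkadilltlaqqydveihntvavgdgandlvmmaaaglgvayhakpkveakaqtavrf
aglggvvcilsaalvaqqklswkskp
"""

cleaned = "".join(NCBI_SEQ.split())
print(len(cleaned), extinction_280(NCBI_SEQ))
```

With 4 Trp, 8 Tyr, and 2 Cys this reproduces the 34045 value recorded above; with `cys_paired=False` the estimate drops by 125.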
TMpred graph Image (http://www.ch.embnet.org/software/TMPRED_form.html). Input your amino acid sequence to it.
TMPRED.8915.3466.gif
*CDS Gene Sequence (paste as text only):
atgagcctggatgcgctgaccaccctgccgattaaaaaacataccgcgctgctgaaccgc tttccggaaacccgctttgtgacccagctggcgaaaaaacgcgcgagctggattgtgttt ggccattatctgaccccggcgcagtttgaagatatggatttttttaccaaccgctttaac gcgattctggatatgtggaaagtgggccgctatgaagtggcgctgatggatggcgaactg accagcgaacatgaaaccattctgaaagcgctggaactggattatgcgcgcattcaggat gtgccggatctgaccaaaccgggcctgattgtgctggatatggatagcaccgcgattcag attgaatgcattgatgaaattgcgaaactggcgggcgtgggcgaagaagtggcggaagtg accgaacgcgcgatgcagggcgaactggattttgaacagagcctgcgcctgcgcgtgagc aaactgaaagatgcgccggaacagattctgagccaggtgcgcgaaaccctgccgctgatg ccggaactgccggaactggtggcgaccctgcatgcgtttggctggaaagtggcgattgcg agcggcggctttacctattttagcgattatctgaaagaacagctgagcctggattatgcg cagagcaacaccctggaaattgtgagcggcaaactgaccggccaggtgctgggcgaagtg gtgagcgcgcagaccaaagcggatattctgctgaccctggcgcagcagtatgatgtggaa attcataacaccgtggcggtgggcgatggcgcgaacgatctggtgatgatggcggcggcg ggcctgggcgtggcgtatcatgcgaaaccgaaagtggaagcgaaagcgcagaccgcggtg cgctttgcgggcctgggcggcgtggtgtgcattctgagcgcggcgctggtggcgcagcag aaactgagctggaaaagcaaagaaggccatcatcatcatcatcat

Julia C.
The gene sequence I found:
1 ATGGACGCGCTGACCACGCTGCCGATCAAAAAACACACCGCGCTCCTGAACCGTTTCCCG
61 GAAACCCGTTTCGTTACGCAACTGGCGAAGAAACGCGCGAGCTGGATCGTTTTCGGTCAC
121 TATCTGACCCCGGCACAGTTTGAAGATATGGACTTCTTCACCAATCGCTTTAATGCCATC
181 CTCGATATGTGGAAAGTGGGTCGTTATGAGGTTGCGCTCATGGACGGTGAACTCACCTCT
241 GAACACGAAACCATTCTGAAGGCGCTGGAACTCGATTACGCACGTATCCAGGACGTTCCG
301 GACCTCACCAAACCGGGTCTCATCGTTCTGGACATGGATTCTACCGCGATTCAGATCGAA
361 TGCATCGACGAAATCGCGAAGCTGGCGGGTGTCGGTGAGGAAGTTGCGGAAGTTACCGAA
421 CGTGCTATGCAGGGCGAACTGGATTTCGAACAGTCTCTGCGTCTCCGTGTTTCTAAACTG
481 AAGGATGCACCGGAACAGATCCTGAGCCAAGTTCGTGAAACCCTGCCGCTGATGCCGGAA
541 CTGCCAGAGCTCGTTGCGACCCTGCACGCATTCGGTTGGAAGGTAGCAATCGCCTCCGGT
601 GGTTTTACCTACTTCAGCGACTACCTGAAAGAGCAGCTCTCTCTGGACTATGCGCAGTCT
661 AACACCCTCGAAATTGTTTCTGGTAAACTCACTGGTCAGGTTCTCGGTGAAGTTGTCTCC
721 GCGCAGACCAAAGCGGACATCCTGCTGACGCTCGCCCAGCAGTATGACGTTGAAATTCAC
781 AACACCGTTGCGGTTGGCGATGGCGCGAACGACCTGGTTATGATGGCGGCTGCGGGCCTG
841 GGTGTGGCCTACCACGCGAAACCGAAAGTCGAAGCAAAGGCGCAAACGGCGGTTCGTTTT
901 GCGGGTCTGGGTGGTGTCGTTTGTATTCTGTCTGCCGCGCTGGTGGCGCAGCAAAAACTG
961 TCTTGGAAATCTAAACCGTAA

Dax:
This is the gene sequence I found, and it is what was used for tail primer design and ordering. It has 100% query coverage and 88% identity to Julia's sequence at the nucleotide level, but when both are translated to amino acid sequences, they match at 100% for both query coverage and identity.

ATGGACGCGCTGACCACCCTCCCGATCAAAAAGCACACCGCGCTGCTGAACCGTTTCCCG
GAAACCCGCTTCGTTACCCAACTGGCGAAAAAGCGTGCGTCTTGGATCGTTTTCGGTCAC
TACCTCACTCCAGCACAGTTTGAAGATATGGATTTTTTCACCAATCGTTTCAATGCGATC
CTGGACATGTGGAAAGTTGGCCGTTACGAAGTTGCGCTGATGGACGGTGAACTGACCTCT
GAACACGAAACCATCCTGAAAGCGCTGGAACTCGACTACGCTCGCATCCAGGACGTTCCA
GACCTCACCAAACCGGGCCTGATCGTTCTCGACATGGACTCTACCGCTATCCAGATCGAA
TGCATCGACGAAATTGCGAAGCTGGCGGGTGTTGGCGAGGAAGTGGCCGAAGTTACGGAA
CGTGCGATGCAGGGCGAGCTGGACTTCGAACAGTCTCTGCGTCTGCGTGTTTCTAAACTC
AAAGACGCCCCTGAACAGATCCTGAGCCAGGTTCGTGAAACGCTGCCGCTCATGCCTGAA
CTGCCGGAACTGGTTGCGACCCTGCACGCGTTCGGTTGGAAGGTAGCAATCGCGTCTGGT
GGTTTCACCTACTTTTCTGACTACCTGAAGGAACAACTCAGCCTCGATTACGCGCAGTCT
AACACCCTGGAAATTGTTTCTGGTAAACTGACTGGTCAAGTTCTGGGTGAAGTTGTGTCT
GCTCAGACCAAAGCGGACATCCTGCTGACCCTGGCGCAACAGTACGACGTTGAAATCCAC
AACACCGTTGCGGTGGGTGACGGTGCGAACGACCTGGTTATGATGGCGGCTGCGGGCCTC
GGTGTAGCGTACCATGCGAAACCGAAGGTTGAGGCGAAGGCGCAGACCGCAGTTCGTTTC
GCTGGTCTCGGTGGTGTCGTTTGCATCCTGTCTGCGGCGCTCGTTGCGCAGCAAAAACTC
TCTTGGAAATCTAAACCGTAA
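The comparison Dax describes (different nucleotides, same protein) can be sketched on a small scale. The example below translates only the first 60 nt of Julia's and Dax's sequences with the standard genetic code and compares them position by position. It is an illustration, not a replacement for the BLAST alignment; the helper names `translate` and `identity` are made up here.

```python
# Standard genetic code, built compactly: codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(dna):
    """Translate DNA in frame 1; trailing partial codons are ignored."""
    dna = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

def identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (no gaps handled)")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# First 60 nt of each gene sequence, copied from this page.
julia_60 = "ATGGACGCGCTGACCACGCTGCCGATCAAAAAACACACCGCGCTCCTGAACCGTTTCCCG"
dax_60 = "ATGGACGCGCTGACCACCCTCCCGATCAAAAAGCACACCGCGCTGCTGAACCGTTTCCCG"

print("nucleotide identity:", identity(julia_60, dax_60))
print("protein identity:   ", identity(translate(julia_60), translate(dax_60)))
```

The nucleotide identity is below 1 because of synonymous codon changes, while the translated peptides are identical. The fraction for this 60 nt window will not equal the whole-gene 88% figure.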

Oligo sequence:

1 ATGGACGCGCTGACCAC 17

2 CCGGGAAACGGTTCAGGAGCGCGGTGTGTTTTTTGATCGGCAGCGTGGTCAGCGCGTCCA 60

3 TCCTGAACCGTTTCCCGGAAACCCGTTTCGTTACGCAACTGGCGAAGAAACGCGCGAGCT 60

4 TCTTCAAACTGTGCCGGGGTCAGATAGTGACCGAAAACGATCCAGCTCGCGCGTTTCTTC 60

5 CCCGGCACAGTTTGAAGATATGGACTTCTTCACCAATCGCTTTAATGCCATCCTCGATAT 60

6 CGTCCATGAGCGCAACCTCATAACGACCCACTTTCCACATATCGAGGATGGCATTAAAGC 60

7 GGTTGCGCTCATGGACGGTGAACTCACCTCTGAACACGAAACCATTCTGAAGGCGCTGGA 60

8 TTGGTGAGGTCCGGAACGTCCTGGATACGTGCGTAATCGAGTTCCAGCGCCTTCAGAATG 60

9 CGTTCCGGACCTCACCAAACCGGGTCTCATCGTTCTGGACATGGATTCTACCGCGATTCA 60

10 ACACCCGCCAGCTTCGCGATTTCGTCGATGCATTCGATCTGAATCGCGGTAGAATCCATG 60

11 CGAAGCTGGCGGGTGTCGGTGAGGAAGTTGCGGAAGTTACCGAACGTGCTATGCAGGGCG 60

12 CAGTTTAGAAACACGGAGACGCAGAGACTGTTCGAAATCCAGTTCGCCCTGCATAGCACG 60

13 CGTCTCCGTGTTTCTAAACTGAAGGATGCACCGGAACAGATCCTGAGCCAAGTTCGTGAA 60

14 GTCGCAACGAGCTCTGGCAGTTCCGGCATCAGCGGCAGGGTTTCACGAACTTGGCTCAGG 60

15 CCAGAGCTCGTTGCGACCCTGCACGCATTCGGTTGGAAGGTAGCAATCGCCTCCGGTGGT 60

16 CCAGAGAGAGCTGCTCTTTCAGGTAGTCGCTGAAGTAGGTAAAACCACCGGAGGCGATTG 60

17 GAAAGAGCAGCTCTCTCTGGACTATGCGCAGTCTAACACCCTCGAAATTGTTTCTGGTAA 60

18 GCGCGGAGACAACTTCACCGAGAACCTGACCAGTGAGTTTACCAGAAACAATTTCGAGGG 60

19 GTGAAGTTGTCTCCGCGCAGACCAAAGCGGACATCCTGCTGACGCTCGCCCAGCAGTATG 60

20 TTCGCGCCATCGCCAACCGCAACGGTGTTGTGAATTTCAACGTCATACTGCTGGGCGAGC 60

21 TGGCGATGGCGCGAACGACCTGGTTATGATGGCGGCTGCGGGCCTGGGTGTGGCCTACCA 60

22 ACGAACCGCCGTTTGCGCCTTTGCTTCGACTTTCGGTTTCGCGTGGTAGGCCACACCCAG 60

23 GCAAACGGCGGTTCGTTTTGCGGGTCTGGGTGGTGTCGTTTGTATTCTGTCTGCCGCGCT 60

24 TTACGGTTTAGATTTCCAAGACAGTTTTTGCTGCGCCACCAGCGCGGCAGACAGAAT 57
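The oligos above alternate between the forward strand and the reverse strand and overlap their neighbors, which is what overlap PCR assembly relies on. A minimal sketch of how to check this is below. It uses only oligos 1-3 and the first 120 nt of Julia's gene sequence; the helper names `place` and `overlap` are made up for the example, and exact string matching is assumed (no mismatches tolerated).

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def place(oligo, gene):
    """Return (start, end, strand) of the oligo on the gene, or None if absent."""
    for strand, s in (("+", oligo), ("-", revcomp(oligo))):
        start = gene.find(s)
        if start != -1:
            return (start, start + len(s), strand)
    return None

def overlap(a, b):
    """Number of gene positions shared by two placements."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

# First 120 nt of Julia's gene sequence, copied from this page.
GENE_START = ("ATGGACGCGCTGACCACGCTGCCGATCAAAAAACACACCGCGCTCCTGAACCGTTTCCCG"
              "GAAACCCGTTTCGTTACGCAACTGGCGAAGAAACGCGCGAGCTGGATCGTTTTCGGTCAC")

# Oligos 1-3 from the list above.
OLIGOS = [
    "ATGGACGCGCTGACCAC",
    "CCGGGAAACGGTTCAGGAGCGCGGTGTGTTTTTTGATCGGCAGCGTGGTCAGCGCGTCCA",
    "TCCTGAACCGTTTCCCGGAAACCCGTTTCGTTACGCAACTGGCGAAGAAACGCGCGAGCT",
]

placements = [place(o, GENE_START) for o in OLIGOS]
overlaps = [overlap(a, b) for a, b in zip(placements, placements[1:])]
print(placements)
print(overlaps)
```

Running the same check over all 24 oligos and the full gene would flag any oligo that fails to place (`None`) or any pair with no overlap.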

*GC% Content for gene: 56.3%
*CDS Gene Sequence (codon optimized) - copy from output of Primer Design Protocol (paste as text only):
*GC% Content for gene (codon optimized):
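For the GC% fields above, a short sketch of the calculation is given here. It is a plain count of G and C over all A/C/G/T characters, so position numbers and spaces in a pasted sequence are ignored; the function name `gc_percent` is made up for the example, and the result may differ slightly from a web tool that treats ambiguous bases differently.

```python
def gc_percent(seq):
    """Percent G+C among the A/C/G/T characters in seq (case-insensitive)."""
    s = "".join(ch for ch in seq.upper() if ch in "ACGT")
    if not s:
        raise ValueError("no nucleotides found")
    return 100.0 * (s.count("G") + s.count("C")) / len(s)

# Oligo 1 pasted exactly as listed on this page, numbers included.
print(round(gc_percent("1 ATGGACGCGCTGACCAC 17"), 1))
```

Pasting the full gene sequence into `gc_percent` gives the value to enter for the gene; the same call works for the codon-optimized sequence once it is available.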

Do Not Need this info for Spring (but still copy these lines to your Target page for now)
Primer design results for pNIC-Bsa4 cloning (list sequences of all of your ~40 nt long primers):
(link to DNA Works output text file - that should be saved in your Google Docs folder after you did the primer design protocol)
-- Ask a mentor, Dr. B, or a fellow researcher how to link a GDocs file if you are not sure how to.

Primer design results for 'tail' primers (this is just 2 sequences):
Upstream:
TACTTCCAATCCATGGACGCGCTGACCA
Downstream (reverse complemented):
TATCCACCTTTACTGTTACGGTTTAGATTTCCAAGAGAG
*Note that this insert contains a BsaI recognition site, so BsaI will cut it.
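The two tail primers can be rebuilt from the ends of Dax's gene sequence, which is a quick way to confirm they match what was ordered. In the sketch below the vector tails (`FWD_TAIL`, `REV_TAIL`) are an assumption: they were obtained by splitting the primers listed above at the start codon and at the reverse-complemented stop codon, not taken from a vector manual. The BsaI site (GGTCTC) search is a simple string scan on both strands, shown here only on the 3' end of the gene.

```python
FWD_TAIL = "TACTTCCAATCC"      # assumed: upstream primer text before the ATG
REV_TAIL = "TATCCACCTTTACTG"   # assumed: downstream primer text before TTA (stop, reversed)
BSAI_SITE = "GGTCTC"

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq, site=BSAI_SITE):
    """List (position, strand) for every occurrence of site on either strand."""
    hits = []
    for strand, motif in (("+", site), ("-", revcomp(site))):
        pos = seq.find(motif)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(motif, pos + 1)
    return hits

# Ends of Dax's gene sequence, copied from this page.
gene_5prime = "ATGGACGCGCTGACCACCCTCCCGATCAAAAAGCAC"
gene_3prime = ("GCTGGTCTCGGTGGTGTCGTTTGCATCCTGTCTGCGGCGCTCGTTGCGCAGCAAAAACTC"
               "TCTTGGAAATCTAAACCGTAA")

upstream = FWD_TAIL + gene_5prime[:16]
downstream = REV_TAIL + revcomp(gene_3prime[-24:])

print(upstream)
print(downstream)
print(find_sites(gene_3prime))
```

The gene-specific lengths (16 nt and 24 nt) are simply what the listed primers contain; a different design rule, such as a target melting temperature, would give different lengths.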

We will be ordering from Shelby-Dax Garibay.




Resources:

See ProtocolTargetDiscoveryVDS.docx for more
Etiologic Risk Group Categories (for pathogens): http://www.utexas.edu/research/rsc/ibc/agent_class.html#_Toc7238334

Databases of genes/organisms:
http://www.niaid.nih.gov/Pages/default.aspx
http://eupathdb.org/eupathdb/
https://patricbrc.vbi.vt.edu/portal/portal/patric/Home
http://www.nmpdr.org/FIG/wiki/view.cgi/Main/EssentialGenes
http://tubic.tju.edu.cn/deg/
http://csgid.org/csgid/cake/pages/community_request_gateway
http://tdrtargets.org/
http://gsc.jcvi.org/status.shtml
Possible inhibitors:
http://www.ncbi.nlm.nih.gov/pubmed/9430431
Comparison of phosphoserine and amino acid binding in another bacterium:
http://www.rcsb.org/pdb/explore/explore.do?structureId=1L7P
Comparison of protein to other bacteria with cation in active site:
http://www.rcsb.org/pdb/explore/explore.do?structureId=1L7M



Scientific Nomenclature page from the Centers for Disease Control and Prevention (gene, protein names and abbreviations)
http://wwwnc.cdc.gov/eid/pages/scientific-nomenclature.htm


Gene Information:
NCBI GENE Page: http://www.ncbi.nlm.nih.gov/gene
BLAST Page: http://blast.ncbi.nlm.nih.gov/

Protein Information:
NCBI Protein Page: http://www.ncbi.nlm.nih.gov/protein
Protein Expression Website
Protein Expression Paper: SGC_ProteinProductionPurificationNatMethods2008.pdf

Primer Overlap PCR Articles
HooverLubkowski_PCRoverlapcloninggnf042.pdf
StemmerPCRoverlapGene1995.pdf

Is my target good for Virtual Screening programs?
Reynolds_THermodynamicsLigandBinding_MedChemLett2011.pdf